This vignette shows sample output from the hubVis package, one of the Hubverse tools. It plots model projections provided as quantile data.

For more information about the Hubverse standard format, please refer to the HubDocs website.

library(hubVis)
library(hubUtils)

Plots are generated by applying the plot_step_ahead_model_output() function to the model output and truth data.

Load and Filter Data

Two datasets are used as examples: the model projections and the truth data.

Load data

projection_path <- "./sample/teamsam-modelple/2023-11-19-teamsam-modelple.parquet"
projection_data <- arrow::read_parquet(projection_path)
head(projection_data)
#> # A tibble: 6 × 8
#>   origin_date target horizon location age_group output_type output_type_id value
#>   <date>      <chr>    <int> <chr>    <chr>     <chr>                <dbl> <dbl>
#> 1 2023-11-19  inc h…       1 US       0-0.99    quantile             0.01   3.83
#> 2 2023-11-19  inc h…       1 US       0-0.99    quantile             0.025  5.73
#> 3 2023-11-19  inc h…       1 US       0-0.99    quantile             0.05   8.90
#> 4 2023-11-19  inc h…       1 US       0-0.99    quantile             0.1   15.2 
#> 5 2023-11-19  inc h…       1 US       0-0.99    quantile             0.15  21.6 
#> 6 2023-11-19  inc h…       1 US       0-0.99    quantile             0.2   27.9

truth_path <- "../target-data/rsvnet_hospitalization.csv"
truth_data <- read.csv(truth_path, stringsAsFactors = FALSE)
head(truth_data)
#>   location       date age_group    target     value population
#> 1       47 2016-10-08    18-130 rate hosp  0.200000    5126526
#> 2       47 2016-10-08    18-130  inc hosp 10.253052    5126526
#> 3       41 2016-10-08    65-130 rate hosp  0.400000     681767
#> 4       41 2016-10-08    65-130  inc hosp  2.727068     681767
#> 5       27 2016-10-08     18-49 rate hosp  0.000000    2280031
#> 6       27 2016-10-08     18-49  inc hosp  0.000000    2280031

Data Preparation

The model output data in the projection_data object follows the structure of the model_out_tbl class, and it is converted to a model_out_tbl object below with as_model_out_tbl(). In addition to the standard requirements for this class, the plot_step_ahead_model_output() function in hubVis requires a column whose values are used for the x-axis of a “step ahead” plot. In general, this should be a date variable giving the date that is the “target” of a particular prediction. By default, the function looks for a "target_date" column, although this can be overridden by specifying a different column with the x_col_name argument. In our example data, this column does not exist, so we add it below, together with a model_id column:

projection_data_a <- dplyr::filter(projection_data, target == "inc hosp",
                                   age_group == "0-130")
projection_data_ab <- dplyr::mutate(
  projection_data_a, target_date = as.Date(origin_date) + (horizon * 7) - 1,
  model_id="teamsam-modelple1")
projection_data_ab <- as_model_out_tbl(projection_data_ab)
head(projection_data_ab)
#> # A tibble: 6 × 10
#>   model_id origin_date target horizon location age_group target_date output_type
#>   <chr>    <date>      <chr>    <int> <chr>    <chr>     <date>      <chr>      
#> 1 teamsam… 2023-11-19  inc h…       1 US       0-130     2023-11-25  quantile   
#> 2 teamsam… 2023-11-19  inc h…       1 US       0-130     2023-11-25  quantile   
#> 3 teamsam… 2023-11-19  inc h…       1 US       0-130     2023-11-25  quantile   
#> 4 teamsam… 2023-11-19  inc h…       1 US       0-130     2023-11-25  quantile   
#> 5 teamsam… 2023-11-19  inc h…       1 US       0-130     2023-11-25  quantile   
#> 6 teamsam… 2023-11-19  inc h…       1 US       0-130     2023-11-25  quantile   
#> # ℹ 2 more variables: output_type_id <dbl>, value <dbl>
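
The date arithmetic used for target_date can be checked in isolation. The sketch below uses base R only and assumes, as in the example data, that horizons are counted in weeks and that the target date is the last day of the corresponding week (origin date plus 7 × horizon days, minus one). The `origins` data frame is a hypothetical stand-in for the model output, not part of the example files.

```r
# Hypothetical stand-in for the model output:
# one origin date and four weekly horizons
origins <- data.frame(
  origin_date = as.Date("2023-11-19"),
  horizon = 1:4
)

# Same arithmetic as in the vignette: the last day of the week
# that lies `horizon` weeks after the origin date
origins$target_date <- origins$origin_date + (origins$horizon * 7) - 1

print(origins)
```

If a hub counts horizons in days or uses a different week-ending convention, the multiplier and offset would need to change accordingly.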

truth_data <- dplyr::filter(truth_data, target == "inc hosp",
                            age_group == "0-130")
truth_data <- dplyr::mutate(truth_data, time_idx = date)

head(truth_data)
#>   location       date age_group   target    value population   time_idx
#> 1       08 2018-10-06     0-130 inc hosp 0.000000    5661221 2018-10-06
#> 2       47 2018-10-06     0-130 inc hosp 6.757828    6757828 2018-10-06
#> 3       49 2018-10-06     0-130 inc hosp 3.150318    3150318 2018-10-06
#> 4       36 2018-10-06     0-130 inc hosp 0.000000   19519158 2018-10-06
#> 5       35 2018-10-06     0-130 inc hosp 0.000000    2082103 2018-10-06
#> 6       27 2018-10-06     0-130 inc hosp 5.606626    5606626 2018-10-06

Plot

The plotting function requires only two parameters: the model output data and the truth data.

“Simple” plot

The projection_data and truth_data objects contain information for multiple locations and scenarios.

To plot the model projections for the US, without a scenario ID:

# Pre-filtering
projection_data_A_us <- dplyr::filter(projection_data_ab, 
                                      location == "US")

# Limit time_idx for layout reasons
truth_data_us <- dplyr::filter(truth_data, location == "US", 
                               time_idx < min(projection_data_ab$target_date),
                               time_idx > "2023-06-01")
plot_step_ahead_model_output(projection_data_A_us, truth_data_us)
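
The window filter applied to the truth data can be illustrated on a toy data frame. This base R sketch assumes the date column is stored as ISO 8601 text, as read.csv() returns it, and converts it to Date before comparing so that the comparison does not depend on string ordering. The names `toy_truth`, `first_target_date`, and `toy_window` are hypothetical and used only for this illustration.

```r
# Hypothetical truth data with dates stored as text, as after read.csv()
toy_truth <- data.frame(
  location = "US",
  time_idx = c("2023-05-27", "2023-06-03", "2023-11-18", "2023-11-25"),
  value = c(10, 12, 40, 45),
  stringsAsFactors = FALSE
)

# Convert to Date so comparisons are made on dates, not on strings
toy_truth$time_idx <- as.Date(toy_truth$time_idx)

# Assumed first projected date (stands in for min(target_date))
first_target_date <- as.Date("2023-11-25")

# Keep observations after 2023-06-01 and before the first projected date
toy_window <- toy_truth[
  toy_truth$time_idx > as.Date("2023-06-01") &
    toy_truth$time_idx < first_target_date,
]

print(toy_window)
```

Both bounds are strict, so an observation dated exactly on the first target date is excluded.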

Facet plot

truth_data <- dplyr::filter(truth_data,
                            time_idx < min(projection_data_ab$target_date),
                            time_idx > "2023-06-01")
plot_step_ahead_model_output(projection_data_ab, truth_data,
                             use_median_as_point = TRUE,
                             facet = "location", facet_scales = "free_x",
                             facet_nrow = 4, facet_title = "top left",
                             show_legend = FALSE)

Layout update

Multiple layout updates are possible:

  • Not showing the truth data in the plot:
plot_step_ahead_model_output(projection_data_A_us, truth_data_us, 
                             plot_truth = FALSE)
  • Change palette color and behavior:

    • The default palette can be changed. The available palette names can be displayed with:
    RColorBrewer::display.brewer.all()

plot_step_ahead_model_output(projection_data_A_us, truth_data_us, 
                             pal_color = "Dark2")

It is possible to use only blues for all models by setting the pal_color parameter to NULL. This is especially useful when plotting many models while highlighting the ensemble forecast with the ens_name and ens_color arguments.

plot_step_ahead_model_output(projection_data_A_us, truth_data_us, 
                             intervals = 0.8,
                             ens_name = "hub-ensemble", ens_color = "black",
                             pal_color = NULL, use_median_as_point = TRUE)
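
A central prediction interval of level p is bounded by the quantiles (1 - p) / 2 and 1 - (1 - p) / 2, so intervals = 0.8 relies on the 0.1 and 0.9 quantiles, and use_median_as_point = TRUE relies on the 0.5 quantile. The base R sketch below checks whether a set of quantile levels covers what a given interval needs. `interval_bounds()` is a hypothetical helper, not part of hubVis, the `available` vector is an assumed set of levels, and how hubVis selects quantiles internally is not shown here.

```r
# Hypothetical helper (not part of hubVis): quantile levels that bound
# a central prediction interval of the given level
interval_bounds <- function(level) {
  alpha <- 1 - level
  c(lower = alpha / 2, upper = 1 - alpha / 2)
}

# Assumed quantile levels present in the output_type_id column
available <- c(0.01, 0.025, 0.05, 0.1, 0.15, 0.2, 0.5,
               0.8, 0.85, 0.9, 0.95, 0.975, 0.99)

# Levels needed for an 80% interval plus the median point
needed <- c(interval_bounds(0.8), median = 0.5)

# Compare with a tolerance to avoid floating-point mismatches
has_all <- all(vapply(
  needed,
  function(q) any(abs(available - q) < 1e-8),
  logical(1)
))

print(needed)
print(has_all)
```

Running a check like this before plotting can explain an empty or partial ribbon when a model does not report the required quantile levels.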

The default blue color can be changed with the one_color parameter:

plot_step_ahead_model_output(projection_data_A_us, truth_data_us, 
                             intervals = 0.8, one_color = "orange",
                             ens_name = "hub-ensemble", ens_color = "black",
                             pal_color = NULL, use_median_as_point = TRUE)

  • Static plot: set interactive = FALSE to return a static plot instead of an interactive one:

plot_step_ahead_model_output(projection_data_A_us, truth_data_us, 
                             interactive = FALSE)

  • Column Names:

The input data frames can use different column names for the date information. In this case, the x_col_name and x_truth_col_name parameters indicate which variables should be mapped to the x-axis.

names(truth_data_us)[names(truth_data_us) == "time_idx"] <- "time"
names(projection_data_A_us)[names(
  projection_data_A_us) == "target_date"] <- "date"
plot_step_ahead_model_output(projection_data_A_us, truth_data_us, 
                             x_col_name = "date", x_truth_col_name = "time")
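
Before plotting with custom column names, it can help to confirm that each data frame actually contains the column that will be passed to x_col_name or x_truth_col_name. The sketch below is a base R illustration; `check_x_col()` is a hypothetical helper and not part of hubVis, and the toy data frames stand in for the renamed model output and truth data.

```r
# Hypothetical helper (not part of hubVis): fail early with a clear
# message if the requested x-axis column is missing
check_x_col <- function(df, col) {
  if (!col %in% names(df)) {
    stop("Column '", col, "' not found; available columns: ",
         paste(names(df), collapse = ", "))
  }
  invisible(TRUE)
}

# Toy stand-ins for the renamed model output and truth data
toy_model <- data.frame(date = as.Date("2023-11-25"), value = 1)
toy_truth_renamed <- data.frame(time = as.Date("2023-11-18"), value = 2)

# Both checks pass silently
check_x_col(toy_model, "date")
check_x_col(toy_truth_renamed, "time")

# The default column name no longer exists after renaming,
# so the helper reports an error, captured here as a message
missing_col <- tryCatch(
  check_x_col(toy_model, "target_date"),
  error = function(e) conditionMessage(e)
)
print(missing_col)
```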